Existing directory: places project in an existing folder
New directory: creates new folder
Version control: handy if you want to use GitHub
Projects are powerful:
R knows where to look for files
No need to worry about setting working directories
Great for sharing
Quarto
Open a Quarto document in your new project
File > New File > Quarto document
Save the document within the project directory (where you already are)
Save the _quarto.yml provided in the email within this directory
Render the document
Packages
The base version of R can be extended with packages
We shall use the tidyverse collection of packages.
#install.packages("tidyverse")
#install.packages("tinytable")
#install.packages("patchwork")
#install.packages("MetBrewer")
#install.packages("ggridges")
library(tidyverse) # for tidy coding
library(tinytable) # for nice tables
library(patchwork) # for aligning plots
library(MetBrewer) # for nice colours to use when making figures
library(ggridges) # for nice density plots
contains() chooses columns whose names contain a pattern
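As a minimal sketch of contains(), the toy tibble below stands in for pterosaur_data. The WING_PHALANX_* column names and all values are assumptions made for illustration and may not match the real dataset.

```r
library(dplyr) # also loaded by library(tidyverse)

# Toy data; the WING_PHALANX_* column names are hypothetical
toy_data <- tibble(
  Individual_ID  = 1:3,
  SKULL          = c(62, 75, 88),
  WING_PHALANX_1 = c(40, 45, 50),
  WING_PHALANX_2 = c(35, 38, 44)
)

# Keep the ID plus every column whose name contains "PHALANX"
wing_only <- toy_data %>%
  select(Individual_ID, contains("PHALANX"))

names(wing_only)
```

By default contains() ignores case, so contains("phalanx") would select the same columns.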
Dangerous coding! Avoid.
Changing column names:
pterosaur_data %>%
  select(Specimen = Individual_ID)
# if you want to keep all other columns
pterosaur_data %>%
  select(Specimen = Individual_ID, everything())
# a recommended alternative
pterosaur_data %>%
  rename(Specimen = Individual_ID)
select() use cases
Create a new dataset that contains only the ID of the individual and the wing measurements for phalanges 2, 3 and 4.
Returning to the original data, remove the measurements for wing phalanges 2 and 4.
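# remove NAs in a single column
pterosaur_data %>%
  filter(!is.na(ORBIT))
# remove all rows with NAs in columns 2 to 15
# (filter_at() is superseded; filter(if_all(2:15, ~ !is.na(.x))) is the current equivalent)
pterosaur_data %>%
  filter_at(vars(2:15), all_vars(!is.na(.)))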
filter() use cases
Find pterosaurs whose necks are longer than their humeri
Returning to the original data, remove measurements with NA TRUNK_LENGTH values, for individuals with IDs greater than 50
Trim the data to include only SKULL lengths between 60 and 90 mm
Find the individuals with the maximum and minimum tail lengths
Find the individuals with tail lengths above the mean of the sampled population
filter(): handy operators
==: equal to
&: and
|: or
!: not
>: greater than
<: less than
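The operators above can be combined inside filter(). The sketch below uses a small made-up tibble in place of pterosaur_data; the values and thresholds are illustrative assumptions only.

```r
library(dplyr) # also loaded by library(tidyverse)

# Toy data standing in for pterosaur_data (values are made up)
toy_data <- tibble(
  Individual_ID = 1:5,
  SKULL         = c(55, 62, 75, 88, 95),
  ORBIT         = c(10, NA, 14, 16, NA)
)

# & : both conditions must be TRUE
mid_skulls <- toy_data %>%
  filter(SKULL > 70 & SKULL < 90)

# | : at least one condition must be TRUE
extreme_skulls <- toy_data %>%
  filter(SKULL < 60 | SKULL > 90)

# ! : negate a condition (here: keep rows where ORBIT is not NA)
with_orbit <- toy_data %>%
  filter(!is.na(ORBIT))

# == : exact match
individual_3 <- toy_data %>%
  filter(Individual_ID == 3)
```

Separating conditions with a comma inside filter() behaves like &, so filter(SKULL > 70, SKULL < 90) returns the same rows as mid_skulls.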
mutate(): modifying existing columns
Let’s change the units of measurement to centimetres
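One way to do this, assuming the measurements are stored in millimetres, is to divide the measurement columns by 10 with mutate() and across(). The toy tibble and its values are made up for illustration; only the SKULL and ORBIT column names come from the examples in this material.

```r
library(dplyr) # also loaded by library(tidyverse)

# Toy data; values are made up and assumed to be in millimetres
toy_data <- tibble(
  Individual_ID = 1:3,
  SKULL         = c(62, 75, 88),
  ORBIT         = c(10, 14, 16)
)

# Overwrite the measurement columns with their values in centimetres
toy_data_cm <- toy_data %>%
  mutate(across(c(SKULL, ORBIT), ~ .x / 10))
```

Because the new columns reuse the existing names, mutate() replaces the originals rather than adding extra columns.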
Now add 10cm to each orbit measurement (but don’t save this!)
pterosaur_data %>%
  mutate(ORBIT = ORBIT + 10)
mutate(): creating new columns
The total length of a wing is roughly the sum of the lengths of the humerus, radius, fourth metacarpal and the four wing phalanges. With mutate(), we can calculate this and add it to the dataset:
# A tibble: 138 × 3
   Individual_ID Size_class single_wing_length
           <dbl> <chr>                   <dbl>
 1             1 Small                    183.
 2             2 Small                    174.
 3             3 Unknown                   NA 
 4             4 Small                    189.
 5             5 Small                    166.
 6             6 Unknown                   NA 
 7             7 Small                    164.
 8             8 Unknown                   NA 
 9             9 Unknown                   NA 
10            10 Small                    221 
# ℹ 128 more rows
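The call that produces output like this is not shown here. A minimal sketch follows, using a toy tibble: every column name except Individual_ID and single_wing_length is hypothetical, and the values are made up.

```r
library(dplyr) # also loaded by library(tidyverse)

# Toy data; the bone column names below are hypothetical
toy_wings <- tibble(
  Individual_ID  = 1:2,
  HUMERUS        = c(30, 28),
  RADIUS         = c(40, NA),
  METACARPAL_4   = c(25, 24),
  WING_PHALANX_1 = c(35, 33),
  WING_PHALANX_2 = c(30, 29),
  WING_PHALANX_3 = c(25, 23),
  WING_PHALANX_4 = c(20, 19)
)

# Add the seven bone lengths together in a new column
wing_lengths <- toy_wings %>%
  mutate(
    single_wing_length = HUMERUS + RADIUS + METACARPAL_4 +
      WING_PHALANX_1 + WING_PHALANX_2 + WING_PHALANX_3 + WING_PHALANX_4
  ) %>%
  select(Individual_ID, single_wing_length)
```

Adding with + returns NA whenever any one of the bones is missing, which is consistent with the NA rows in the output above.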
~ .x * 1.1 multiplies the running value by 1.1 at each step (adding 10%), starting from the initial value 100
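The original call is not reproduced here, so the following is only a sketch of how such a formula can be used with purrr::accumulate(). The 1:5 step vector and the use of .init = 100 are assumptions about the setup.

```r
library(purrr) # also loaded by library(tidyverse)

# Start at 100 and grow by 10% at each of 5 steps.
# .x is the running value; the elements of 1:5 only set the number of steps.
balance <- accumulate(1:5, ~ .x * 1.1, .init = 100)

balance
```

With .init supplied, the result has one more element than the input: the initial value followed by one value per step.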
accumulate() use cases
Imagine now that instead of money, you need to track the number of pterosaurs in a population. That population starts off with 400 individuals and decreases by 5% each year. Track the population for 25 years.
Now consider migration. Each year, 10 individuals enter the population from a neighbouring source population.
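One possible approach to both scenarios is sketched below with purrr::accumulate(). It assumes that the 5% decline is applied first and the 10 immigrants are added afterwards each year; a different ordering would give slightly different numbers.

```r
library(purrr) # also loaded by library(tidyverse)

years <- 1:25

# 5% decline per year, starting from 400 individuals
decline_only <- accumulate(years, ~ .x * 0.95, .init = 400)

# Same decline, then 10 immigrants arrive each year (assumed order)
with_migration <- accumulate(years, ~ .x * 0.95 + 10, .init = 400)
```

Both vectors hold 26 values: the starting population followed by one value per year. The values are not whole numbers, so round them if whole individuals are needed.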
Visualising data
Science communication is often most effective through visual media
The ggplot2 package is included in the tidyverse
ggplot()
Build plots one layer at a time
Layers are added on top of one another
New layers are added with the + symbol
+ == %>% in ggplot-land
Getting started
pterosaur_data_classes %>%
  ggplot(aes())
ggplot() provides an empty canvas
aes() determines how variables are mapped to visual aesthetics
Building a geom_histogram()
pterosaur_data_classes %>%
  ggplot(aes(x = SKULL/10, fill = Size_class)) +
  geom_histogram(binwidth = 0.1)
Fix the labels
pterosaur_data_classes %>%
  ggplot(aes(x = SKULL/10, fill = Size_class)) +
  geom_histogram(binwidth = 0.1) +
  labs(x = "Skull length (cm)", y = "No. individuals", fill = "Size class")
Fix the theming
pterosaur_data_classes %>%
  ggplot(aes(x = SKULL/10, fill = Size_class)) +
  geom_histogram(binwidth = 0.1) +
  labs(x = "Skull length (cm)", y = "No. individuals", fill = "Size class") +
  theme_classic() + # new
  theme(panel.grid.major = element_line(), # new
        text = element_text(size = 14)) # new